Overview

Dataset statistics

Number of variables: 16
Number of observations: 560129
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 14188
Duplicate rows (%): 2.5%
Total size in memory: 65.2 MiB
Average record size in memory: 122.0 B
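The derived figures in this block follow from the raw counts. The sketch below recomputes the duplicate share and the average record size from the numbers reported above; the variable names are illustrative, and because 65.2 MiB is itself a rounded value, the record size only agrees with the report to within rounding.

```python
# Figures copied from the "Dataset statistics" block.
n_rows = 560_129
n_duplicates = 14_188
total_mib = 65.2  # reported total size in memory (rounded), in MiB


def pct(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


duplicate_pct = pct(n_duplicates, n_rows)
# 1 MiB = 1024 * 1024 bytes
avg_record_bytes = total_mib * 1024 * 1024 / n_rows

print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```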

Variable types

DateTime: 2
Numeric: 9
Categorical: 4
Text: 1

Alerts

Dataset has 14188 (2.5%) duplicate rows [Duplicates]
subcategory has a high cardinality: 53 distinct values [High cardinality]
category is highly overall correlated with subcategory [High correlation]
deplaned_passenger is highly overall correlated with enplaned_passenger [High correlation]
enplaned_passenger is highly overall correlated with deplaned_passenger [High correlation]
incident_no is highly overall correlated with incident_year [High correlation]
incident_year is highly overall correlated with incident_no [High correlation]
latitude is highly overall correlated with neighborhood [High correlation]
longitude is highly overall correlated with neighborhood [High correlation]
max_temperature is highly overall correlated with min_temperature [High correlation]
min_temperature is highly overall correlated with max_temperature [High correlation]
neighborhood is highly overall correlated with latitude and 1 other field [High correlation]
subcategory is highly overall correlated with category [High correlation]
precipitation has 461640 (82.4%) zeros [Zeros]
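Two of these alerts can be reproduced from counts alone. The sketch below is a simplified illustration, not the profiler's own logic: the function names and both thresholds are assumptions chosen for the example, and only the counts come from the report.

```python
# Counts copied from the report; the thresholds are assumptions for illustration.
N_ROWS = 560_129
HIGH_CARDINALITY_THRESHOLD = 50  # assumed cutoff for distinct categorical values
ZEROS_THRESHOLD_PCT = 5.0        # assumed cutoff for the share of zeros


def zeros_alert(name, n_zeros, n_rows=N_ROWS):
    """Return a Zeros alert message, or None if the share is below the cutoff."""
    share = 100.0 * n_zeros / n_rows
    if share > ZEROS_THRESHOLD_PCT:
        return f"{name} has {n_zeros} ({share:.1f}%) zeros"
    return None


def cardinality_alert(name, n_distinct):
    """Return a High cardinality alert message, or None if below the cutoff."""
    if n_distinct > HIGH_CARDINALITY_THRESHOLD:
        return f"{name} has a high cardinality: {n_distinct} distinct values"
    return None


candidates = [
    zeros_alert("precipitation", 461_640),
    cardinality_alert("subcategory", 53),
    cardinality_alert("category", 29),  # below the assumed cutoff, so no alert
]
alerts = [a for a in candidates if a is not None]
for alert in alerts:
    print(alert)
```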

Reproduction

Analysis started: 2024-07-17 14:36:00.165998
Analysis finished: 2024-07-17 14:37:08.984351
Duration: 1 minute and 8.82 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
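The reported duration is the difference between the two timestamps. A minimal check using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-07-17 14:36:00.165998", FMT)
finished = datetime.strptime("2024-07-17 14:37:08.984351", FMT)

elapsed = finished - started
# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```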

Variables

Distinct: 2382
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 MiB
Minimum: 2018-01-01 00:00:00
Maximum: 2024-07-09 00:00:00
Histogram with fixed size bins (bins=50)

Distinct: 1440
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 MiB
Minimum: 2024-07-17 00:00:00
Maximum: 2024-07-17 23:59:00
Histogram with fixed size bins (bins=50)

incident_year
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2020.6606
Minimum: 2018
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 2018
5-th percentile: 2018
Q1: 2019
Median: 2021
Q3: 2022
95-th percentile: 2024
Maximum: 2024
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.8921166
Coefficient of variation (CV): 0.00093638518
Kurtosis: -1.23106
Mean: 2020.6606
Median Absolute Deviation (MAD): 2
Skewness: 0.079722469
Sum: 1.1318306 × 10^9
Variance: 3.5801054
Monotonicity: Increasing
Histogram with fixed size bins (bins=7)
Common values

Value  Count  Frequency (%)
2018   97239  17.4%
2019   94823  16.9%
2022   90138  16.1%
2023   86866  15.5%
2021   83388  14.9%
2020   73910  13.2%
2024   33765  6.0%

Minimum 10 values

Value  Count  Frequency (%)
2018   97239  17.4%
2019   94823  16.9%
2020   73910  13.2%
2021   83388  14.9%
2022   90138  16.1%
2023   86866  15.5%
2024   33765  6.0%

Maximum 10 values

Value  Count  Frequency (%)
2024   33765  6.0%
2023   86866  15.5%
2022   90138  16.1%
2021   83388  14.9%
2020   73910  13.2%
2019   94823  16.9%
2018   97239  17.4%
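Because incident_year has only seven distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does this with plain Python; the use of the sample variance (dividing by n - 1) is an assumption about how the report computes it, and it happens to reproduce the reported figures to the printed precision.

```python
# Value counts copied from the incident_year frequency table.
year_counts = {
    2018: 97_239, 2019: 94_823, 2020: 73_910, 2021: 83_388,
    2022: 90_138, 2023: 86_866, 2024: 33_765,
}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance: the n - 1 divisor is an assumption, see the note above.
variance = sum(
    count * (year - mean) ** 2 for year, count in year_counts.items()
) / (n - 1)
std = variance ** 0.5
cv = std / mean  # coefficient of variation

print(f"n={n}, sum={total}, mean={mean:.4f}")
print(f"variance={variance:.7f}, std={std:.7f}, CV={cv:.11f}")
```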

incident_day
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 MiB

Value             Count
Friday            87435
Saturday          81594
Wednesday         81358
Thursday          78490
Monday            78473
Other values (2)  152779

Length

Max length: 9
Median length: 8
Mean length: 7.1448059
Min length: 6

Characters and Unicode

Total characters: 4002013
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Monday
2nd row: Monday
3rd row: Monday
4th row: Monday
5th row: Monday

Common Values

Value      Count  Frequency (%)
Friday     87435  15.6%
Saturday   81594  14.6%
Wednesday  81358  14.5%
Thursday   78490  14.0%
Monday     78473  14.0%
Tuesday    76997  13.7%
Sunday     75782  13.5%
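The length and character statistics for incident_day are fully determined by this table, since every value is one of seven day names. As a sketch (variable names are illustrative), weighting each name's length and characters by its count gives the total characters, mean length, and per-character counts:

```python
from collections import Counter

# Counts copied from the incident_day "Common Values" table.
day_counts = {
    "Friday": 87_435, "Saturday": 81_594, "Wednesday": 81_358,
    "Thursday": 78_490, "Monday": 78_473, "Tuesday": 76_997,
    "Sunday": 75_782,
}

n = sum(day_counts.values())
total_chars = sum(len(day) * count for day, count in day_counts.items())
mean_length = total_chars / n

# Weight each character of a day name by how often that day occurs.
char_counts = Counter()
for day, count in day_counts.items():
    for ch in day:
        char_counts[ch] += count

print(total_chars, round(mean_length, 7), len(char_counts))
print(char_counts.most_common(3))
```

Upper- and lower-case letters are counted separately here, which is what the character table in the report appears to do (it lists both "s" and "S").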

Length

Histogram of lengths of the category

Words

Value      Count  Frequency (%)
friday     87435  15.6%
saturday   81594  14.6%
wednesday  81358  14.5%
thursday   78490  14.0%
monday     78473  14.0%
tuesday    76997  13.7%
sunday     75782  13.5%

Most occurring characters

Value             Count   Frequency (%)
a                 641723  16.0%
d                 641487  16.0%
y                 560129  14.0%
u                 312863  7.8%
r                 247519  6.2%
e                 239713  6.0%
s                 236845  5.9%
n                 235613  5.9%
S                 157376  3.9%
T                 155487  3.9%
Other values (7)  573258  14.3%

Most occurring categories, scripts and blocks

Property  Value      Count    Frequency (%)
Category  (unknown)  4002013  100.0%
Script    (unknown)  4002013  100.0%
Block     (unknown)  4002013  100.0%

incident_no
Real number (ℝ)

HIGH CORRELATION 

Distinct: 510613
Distinct (%): 91.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0888726 × 10^8
Minimum: 0
Maximum: 9.81172 × 10^8
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1.8044113 × 10^8
Q1: 1.90685 × 10^8
Median: 2.1022944 × 10^8
Q3: 2.2609293 × 10^8
95-th percentile: 2.4010689 × 10^8
Maximum: 9.81172 × 10^8
Range: 9.81172 × 10^8
Interquartile range (IQR): 35407928

Descriptive statistics

Standard deviation: 19227070
Coefficient of variation (CV): 0.092045199
Kurtosis: 17.225027
Mean: 2.0888726 × 10^8
Median Absolute Deviation (MAD): 19282543
Skewness: 0.45126775
Sum: 1.1700381 × 10^14
Variance: 3.696802 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
190202001              229     < 0.1%
106000012              15      < 0.1%
210504882              14      < 0.1%
210862561              14      < 0.1%
200464246              13      < 0.1%
210737328              13      < 0.1%
206093035              11      < 0.1%
100000000              11      < 0.1%
180084104              10      < 0.1%
200734544              10      < 0.1%
Other values (510603)  559789  99.9%

Minimum 10 values

Value    Count  Frequency (%)
0        2      < 0.1%
1808670  1      < 0.1%
1813494  1      < 0.1%
1819855  1      < 0.1%
1819873  1      < 0.1%
1831758  1      < 0.1%
1831875  1      < 0.1%
2000558  1      < 0.1%
2001461  1      < 0.1%
2002056  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
981171996  1      < 0.1%
981081925  1      < 0.1%
911272570  1      < 0.1%
793282725  1      < 0.1%
782312915  1      < 0.1%
773809399  1      < 0.1%
700013570  1      < 0.1%
270762961  1      < 0.1%
251030935  1      < 0.1%
250328844  1      < 0.1%

category
Categorical

HIGH CORRELATION 

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value                Count
Larceny Theft        258689
Vandalism            53633
Assault              46011
Motor Vehicle Theft  45837
Burglary             42970
Other values (24)    112989

Length

Max length: 44
Median length: 13
Mean length: 12.263047
Min length: 4

Characters and Unicode

Total characters: 6868888
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Assault
2nd row: Miscellaneous Investigation
3rd row: Miscellaneous Investigation
4th row: Fraud
5th row: Miscellaneous Investigation

Common Values

Value                Count   Frequency (%)
Larceny Theft        258689  46.2%
Vandalism            53633   9.6%
Assault              46011   8.2%
Motor Vehicle Theft  45837   8.2%
Burglary             42970   7.7%
Lost Property        24509   4.4%
Fraud                21519   3.8%
Robbery              14588   2.6%
Drug Offense         12500   2.2%
Disorderly Conduct   9408    1.7%
Other values (19)    30465   5.4%
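The "Other values" row is the remainder after the ten most frequent categories are listed. A small sketch of that roll-up, using the counts from the table above (the names are illustrative, and this is not the profiler's own code):

```python
# Ten most frequent categories, copied from the "Common Values" table.
top_counts = {
    "Larceny Theft": 258_689, "Vandalism": 53_633, "Assault": 46_011,
    "Motor Vehicle Theft": 45_837, "Burglary": 42_970,
    "Lost Property": 24_509, "Fraud": 21_519, "Robbery": 14_588,
    "Drug Offense": 12_500, "Disorderly Conduct": 9_408,
}
N_ROWS = 560_129  # number of observations
N_DISTINCT = 29   # distinct values of category

# Whatever is not covered by the listed rows is rolled into one line.
other_count = N_ROWS - sum(top_counts.values())
other_distinct = N_DISTINCT - len(top_counts)

rows = list(top_counts.items())
rows.append((f"Other values ({other_distinct})", other_count))
for label, count in rows:
    print(f"{label:<22} {count:>7} {100 * count / N_ROWS:5.1f}%")
```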

Length

Histogram of lengths of the category

Words

Value              Count   Frequency (%)
theft              304565  30.6%
larceny            258689  26.0%
vandalism          53633   5.4%
vehicle            46642   4.7%
assault            46011   4.6%
motor              45876   4.6%
burglary           42970   4.3%
property           25959   2.6%
lost               24509   2.5%
fraud              21519   2.2%
Other values (34)  124608  12.5%

Most occurring characters

Value              Count    Frequency (%)
e                  790905   11.5%
r                  544340   7.9%
a                  517698   7.5%
t                  499842   7.3%
(space)            434852   6.3%
n                  391183   5.7%
y                  355165   5.2%
f                  354315   5.2%
h                  351207   5.1%
c                  337234   4.9%
Other values (35)  2292147  33.4%

Most occurring categories, scripts and blocks

Property  Value      Count    Frequency (%)
Category  (unknown)  6868888  100.0%
Script    (unknown)  6868888  100.0%
Block     (unknown)  6868888  100.0%

subcategory
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 MiB

Value                   Count
Larceny - From Vehicle  144548
Larceny Theft - Other   60647
Vandalism               53633
Motor Vehicle Theft     45301
Simple Assault          29177
Other values (48)       226823

Length

Max length: 40
Median length: 29
Mean length: 18.313098
Min length: 4

Characters and Unicode

Total characters: 10257697
Distinct characters: 47
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Simple Assault
2nd row: Miscellaneous Investigation
3rd row: Miscellaneous Investigation
4th row: Fraud
5th row: Miscellaneous Investigation

Common Values

Value                        Count   Frequency (%)
Larceny - From Vehicle       144548  25.8%
Larceny Theft - Other        60647   10.8%
Vandalism                    53633   9.6%
Motor Vehicle Theft          45301   8.1%
Simple Assault               29177   5.2%
Lost Property                24509   4.4%
Fraud                        21473   3.8%
Burglary - Other             17390   3.1%
Larceny Theft - Shoplifting  17056   3.0%
Aggravated Assault           16831   3.0%
Other values (43)            129564  23.1%

Length

Histogram of lengths of the category

Words

Value              Count   Frequency (%)
(blank)            302812  18.1%
larceny            245239  14.7%
vehicle            204640  12.2%
from               171637  10.3%
theft              156709  9.4%
other              85765   5.1%
vandalism          53633   3.2%
assault            46011   2.8%
motor              45876   2.7%
burglary           42970   2.6%
Other values (56)  315367  18.9%

Most occurring characters

Value              Count    Frequency (%)
(space)            1110530  10.8%
e                  1102810  10.8%
r                  808515   7.9%
a                  583868   5.7%
t                  525553   5.1%
i                  496281   4.8%
c                  490169   4.8%
l                  482828   4.7%
h                  464418   4.5%
o                  453370   4.4%
Other values (37)  3739355  36.5%

Most occurring categories, scripts and blocks

Property  Value      Count     Frequency (%)
Category  (unknown)  10257697  100.0%
Script    (unknown)  10257697  100.0%
Block     (unknown)  10257697  100.0%

Distinct: 556
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 MiB

Length

Max length: 80
Median length: 59
Mean length: 30.750061
Min length: 4

Characters and Unicode

Total characters: 17224001
Distinct characters: 72
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): < 0.1%

Sample

1st row: Battery
2nd row: Miscellaneous Investigation
3rd row: Miscellaneous Investigation
4th row: Fraudulent Game or Trick, Obtaining Money or Property
5th row: Miscellaneous Investigation

Words

Value               Count    Frequency (%)
theft               246822   10.5%
vehicle             216757   9.2%
from                165590   7.0%
950                 161006   6.8%
locked              124264   5.3%
property            114810   4.9%
other               78019    3.3%
stolen              59176    2.5%
mischief            51726    2.2%
malicious           51726    2.2%
Other values (605)  1089978  46.2%

Most occurring characters

Value              Count    Frequency (%)
(space)            1800648  10.5%
e                  1669848  9.7%
o                  993693   5.8%
t                  990411   5.8%
r                  953957   5.5%
,                  848980   4.9%
i                  828843   4.8%
l                  698378   4.1%
h                  662204   3.8%
c                  661837   3.8%
Other values (62)  7115202  41.3%

Most occurring categories, scripts and blocks

Property  Value      Count     Frequency (%)
Category  (unknown)  17224001  100.0%
Script    (unknown)  17224001  100.0%
Block     (unknown)  17224001  100.0%

neighborhood
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.5 MiB

Value                           Count
Financial District/South Beach  74375
Mission                         56880
Tenderloin                      44291
South of Market                 37820
Bayview Hunters Point           31208
Other values (36)               315555

Length

Max length: 30
Median length: 18
Mean length: 14.849129
Min length: 6

Characters and Unicode

Total characters: 8317428
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mission
2nd row: Tenderloin
3rd row: Tenderloin
4th row: Outer Richmond
5th row: Financial District/South Beach

Common Values

Value                           Count   Frequency (%)
Financial District/South Beach  74375   13.3%
Mission                         56880   10.2%
Tenderloin                      44291   7.9%
South of Market                 37820   6.8%
Bayview Hunters Point           31208   5.6%
North Beach                     21164   3.8%
Western Addition                17443   3.1%
Marina                          17044   3.0%
Hayes Valley                    16756   3.0%
Sunset/Parkside                 15987   2.9%
Other values (31)               227161  40.6%

Length

Histogram of lengths of the category

Words

Value              Count   Frequency (%)
beach              95539   8.6%
financial          74375   6.7%
district/south     74375   6.7%
mission            72859   6.6%
market             53387   4.8%
of                 49242   4.4%
tenderloin         44291   4.0%
hill               40366   3.6%
south              37820   3.4%
bayview            31208   2.8%
Other values (46)  537337  48.4%

Most occurring characters

Value              Count    Frequency (%)
i                  828313   10.0%
e                  640225   7.7%
n                  622019   7.5%
t                  588128   7.1%
a                  584695   7.0%
(space)            550670   6.6%
o                  503015   6.0%
s                  466977   5.6%
r                  429666   5.2%
c                  314779   3.8%
Other values (36)  2788941  33.5%

Most occurring categories, scripts and blocks

Property  Value      Count    Frequency (%)
Category  (unknown)  8317428  100.0%
Script    (unknown)  8317428  100.0%
Block     (unknown)  8317428  100.0%

latitude
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11405
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.770708
Minimum: 37.707988
Maximum: 37.829991
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 37.707988
5-th percentile: 37.721577
Q1: 37.757121
Median: 37.77749
Q3: 37.788269
95-th percentile: 37.803823
Maximum: 37.829991
Range: 0.1220025
Interquartile range (IQR): 0.031148193
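Range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them from the printed values; since those are already rounded by the report, agreement is only expected to about six decimal places.

```python
# Quantile statistics copied from the latitude block (rounded by the report).
minimum, q1, q3, maximum = 37.707988, 37.757121, 37.788269, 37.829991

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"range={value_range:.6f}, IQR={iqr:.6f}")
```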

Descriptive statistics

Standard deviation: 0.024325427
Coefficient of variation (CV): 0.00064402888
Kurtosis: -0.24542151
Mean: 37.770708
Median Absolute Deviation (MAD): 0.012579433
Skewness: -0.73442557
Sum: 21156469
Variance: 0.00059172638
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
37.7892407            29316   5.2%
37.76126807           3684    0.7%
37.78456014           3300    0.6%
37.72694991           2770    0.5%
37.78640961           2261    0.4%
37.80549665           1924    0.3%
37.78404444           1829    0.3%
37.76505134           1795    0.3%
37.78574399           1694    0.3%
37.78445273           1616    0.3%
Other values (11395)  509940  91.0%

Minimum 10 values

Value        Count  Frequency (%)
37.70798826  10     < 0.1%
37.70801926  3      < 0.1%
37.70802018  66     < 0.1%
37.7080307   1      < 0.1%
37.70805761  24     < 0.1%
37.70816803  1      < 0.1%
37.7082148   6      < 0.1%
37.70825596  51     < 0.1%
37.70830771  1      < 0.1%
37.70831127  79     < 0.1%

Maximum 10 values

Value        Count  Frequency (%)
37.82999075  92     < 0.1%
37.82999039  8      < 0.1%
37.82979202  1      < 0.1%
37.82979158  29     < 0.1%
37.8296814   1      < 0.1%
37.8296623   43     < 0.1%
37.82961662  33     < 0.1%
37.82961655  1      < 0.1%
37.82954858  107    < 0.1%
37.82954788  7      < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11051
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -122.42379
Minimum: -122.51142
Maximum: -122.36374
Zeros: 0
Zeros (%): 0.0%
Negative: 560129
Negative (%): 100.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: -122.51142
5-th percentile: -122.48045
Q1: -122.43471
Median: -122.41753
Q3: -122.4058
95-th percentile: -122.39142
Maximum: -122.36374
Range: 0.14768219
Interquartile range (IQR): 0.028915118

Descriptive statistics

Standard deviation: 0.026550604
Coefficient of variation (CV): -0.00021687455
Kurtosis: 1.1224504
Mean: -122.42379
Median Absolute Deviation (MAD): 0.013886066
Skewness: -1.1627429
Sum: -68573113
Variance: 0.00070493456
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
-122.4007224          29316   5.2%
-122.4167966          3684    0.7%
-122.407337           3300    0.6%
-122.4760395          2770    0.5%
-122.4080362          2261    0.4%
-122.4220068          1924    0.3%
-122.4037118          1829    0.3%
-122.419669           1795    0.3%
-122.405831           1694    0.3%
-122.4084932          1616    0.3%
Other values (11041)  509940  91.0%

Minimum 10 values

Value         Count  Frequency (%)
-122.5114212  7      < 0.1%
-122.5112949  1442   0.3%
-122.5112915  30     < 0.1%
-122.5112839  2      < 0.1%
-122.511055   5      < 0.1%
-122.5109253  10     < 0.1%
-122.5103413  20     < 0.1%
-122.51017    5      < 0.1%
-122.5101688  73     < 0.1%
-122.5100403  6      < 0.1%

Maximum 10 values

Value         Count  Frequency (%)
-122.363739   6      < 0.1%
-122.3637428  119    < 0.1%
-122.3662872  1      < 0.1%
-122.3679199  1      < 0.1%
-122.3680954  2      < 0.1%
-122.3682327  1      < 0.1%
-122.36843    57     < 0.1%
-122.3684311  2      < 0.1%
-122.3690338  3      < 0.1%
-122.3690371  40     < 0.1%

precipitation
Real number (ℝ)

ZEROS 

Distinct: 108
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.055742445
Minimum: 0
Maximum: 5.46
Zeros: 461640
Zeros (%): 82.4%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.34
Maximum: 5.46
Range: 5.46
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.23694259
Coefficient of variation (CV): 4.2506673
Kurtosis: 164.08547
Mean: 0.055742445
Median Absolute Deviation (MAD): 0
Skewness: 10.154109
Sum: 31222.96
Variance: 0.056141791
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
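With more than four in five values equal to zero, the median, Q1, Q3 and IQR of precipitation are all 0, while the mean and the coefficient of variation are driven by the few wet days. As a sketch, the mean, CV and zero share can be recovered from the reported sum, standard deviation and counts (variable names are illustrative):

```python
# Figures copied from the precipitation block.
n_rows = 560_129
n_zeros = 461_640
total = 31_222.96  # reported Sum
std = 0.23694259   # reported standard deviation

mean = total / n_rows
cv = std / mean  # coefficient of variation = std / mean
zero_share = 100 * n_zeros / n_rows

print(f"mean={mean:.6f}, CV={cv:.4f}, zeros={zero_share:.1f}%")
```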
Common values

Value              Count   Frequency (%)
0                  461640  82.4%
0.01               11164   2.0%
0.02               9030    1.6%
0.04               4418    0.8%
0.07               3568    0.6%
0.03               3294    0.6%
0.06               2585    0.5%
0.21               2523    0.5%
0.08               2302    0.4%
0.3                1846    0.3%
Other values (98)  57759   10.3%

Minimum 10 values

Value  Count   Frequency (%)
0      461640  82.4%
0.01   11164   2.0%
0.02   9030    1.6%
0.03   3294    0.6%
0.04   4418    0.8%
0.05   1018    0.2%
0.06   2585    0.5%
0.07   3568    0.6%
0.08   2302    0.4%
0.09   1499    0.3%

Maximum 10 values

Value  Count  Frequency (%)
5.46   230    < 0.1%
4.02   165    < 0.1%
3.15   251    < 0.1%
2.57   213    < 0.1%
2.49   226    < 0.1%
2.22   281    0.1%
1.67   177    < 0.1%
1.6    279    < 0.1%
1.5    192    < 0.1%
1.4    181    < 0.1%

min_temperature
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.698209
Minimum: 39
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 39
5-th percentile: 44
Q1: 49
Median: 52
Q3: 55
95-th percentile: 59
Maximum: 72
Range: 33
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.4715548
Coefficient of variation (CV): 0.086493418
Kurtosis: 0.027618486
Mean: 51.698209
Median Absolute Deviation (MAD): 3
Skewness: -0.017587858
Sum: 28957666
Variance: 19.994802
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Common values

Value              Count   Frequency (%)
52                 51187   9.1%
54                 48477   8.7%
53                 45852   8.2%
51                 44826   8.0%
55                 43377   7.7%
49                 41378   7.4%
50                 40379   7.2%
56                 34908   6.2%
48                 32350   5.8%
57                 27954   5.0%
Other values (21)  149441  26.7%

Minimum 10 values

Value  Count  Frequency (%)
39     886    0.2%
40     1555   0.3%
41     2464   0.4%
42     5025   0.9%
43     8979   1.6%
44     15793  2.8%
45     19413  3.5%
46     21090  3.8%
47     26096  4.7%
48     32350  5.8%

Maximum 10 values

Value  Count  Frequency (%)
72     236    < 0.1%
71     178    < 0.1%
69     170    < 0.1%
68     243    < 0.1%
65     477    0.1%
64     1418   0.3%
63     633    0.1%
62     3243   0.6%
61     3824   0.7%
60     7852   1.4%

max_temperature
Real number (ℝ)

HIGH CORRELATION 

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.009794
Minimum: 46
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 46
5-th percentile: 54
Q1: 59
Median: 63
Q3: 68
95-th percentile: 77
Maximum: 100
Range: 54
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.1465252
Coefficient of variation (CV): 0.11164737
Kurtosis: 2.048087
Mean: 64.009794
Median Absolute Deviation (MAD): 4
Skewness: 1.0303219
Sum: 35853742
Variance: 51.072823
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
63                 36108   6.4%
61                 34655   6.2%
62                 34600   6.2%
60                 33739   6.0%
64                 33541   6.0%
66                 32963   5.9%
65                 32494   5.8%
59                 32449   5.8%
67                 28280   5.0%
58                 27314   4.9%
Other values (42)  233986  41.8%

Minimum 10 values

Value  Count  Frequency (%)
46     206    < 0.1%
48     605    0.1%
49     939    0.2%
50     2558   0.5%
51     1546   0.3%
52     5007   0.9%
53     9126   1.6%
54     12151  2.2%
55     16641  3.0%
56     19785  3.5%

Maximum 10 values

Value  Count  Frequency (%)
100    178    < 0.1%
98     235    < 0.1%
97     236    < 0.1%
95     193    < 0.1%
94     733    0.1%
93     467    0.1%
92     631    0.1%
91     652    0.1%
90     744    0.1%
89     426    0.1%

deplaned_passenger
Real number (ℝ)

HIGH CORRELATION 

Distinct: 76
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1815323.4
Minimum: 70582
Maximum: 2905562
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 70582
5-th percentile: 377145
Q1: 1371431
Median: 2032485
Q3: 2278985
95-th percentile: 2798782
Maximum: 2905562
Range: 2834980
Interquartile range (IQR): 907554

Descriptive statistics

Standard deviation: 715085.85
Coefficient of variation (CV): 0.3939165
Kurtosis: -0.23766227
Mean: 1815323.4
Median Absolute Deviation (MAD): 360087
Skewness: -0.81448697
Sum: 1.0168153 × 10^12
Variance: 5.1134777 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
2112785            16685   3.0%
2392572            8926    1.6%
2905562            8880    1.6%
2798782            8823    1.6%
2889791            8799    1.6%
2401192            8590    1.5%
2172552            8561    1.5%
2850432            8440    1.5%
2533342            8417    1.5%
2247754            8378    1.5%
Other values (66)  465630  83.1%

Minimum 10 values

Value   Count  Frequency (%)
70582   4928   0.9%
141790  5754   1.0%
270718  5639   1.0%
367568  5689   1.0%
377145  6203   1.1%
400788  6503   1.2%
418398  6053   1.1%
436216  6381   1.1%
442442  5573   1.0%
522277  6171   1.1%

Maximum 10 values

Value    Count  Frequency (%)
2905562  8880   1.6%
2889791  8799   1.6%
2850432  8440   1.5%
2798782  8823   1.6%
2665667  7704   1.4%
2658235  7908   1.4%
2533342  8417   1.5%
2510276  8148   1.5%
2497897  7559   1.3%
2425591  8311   1.5%

enplaned_passenger
Real number (ℝ)

HIGH CORRELATION 

Distinct: 76
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1804831.8
Minimum: 68235
Maximum: 2844325
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.5 MiB

Quantile statistics

Minimum: 68235
5-th percentile: 369120
Q1: 1360243
Median: 2006413
Q3: 2317631
95-th percentile: 2763785
Maximum: 2844325
Range: 2776090
Interquartile range (IQR): 957388

Descriptive statistics

Standard deviation: 707785.71
Coefficient of variation (CV): 0.3921616
Kurtosis: -0.26454461
Mean: 1804831.8
Median Absolute Deviation (MAD): 393148
Skewness: -0.80844854
Sum: 1.0109386 × 10^12
Variance: 5.0096061 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count   Frequency (%)
2019052            16685   3.0%
2425539            8926    1.6%
2779507            8880    1.6%
2740357            8823    1.6%
2844325            8799    1.6%
2317631            8590    1.5%
2016637            8561    1.5%
2754032            8440    1.5%
2419322            8417    1.5%
2217062            8378    1.5%
Other values (66)  465630  83.1%

Minimum 10 values

Value   Count  Frequency (%)
68235   4928   0.9%
144780  5754   1.0%
284401  5639   1.0%
368635  5689   1.0%
369120  6503   1.2%
388129  6203   1.1%
434000  6053   1.1%
463550  5573   1.0%
482446  6381   1.1%
529962  6171   1.1%

Maximum 10 values

Value    Count  Frequency (%)
2844325  8799   1.6%
2793918  7704   1.4%
2779507  8880   1.6%
2763785  7908   1.4%
2754032  8440   1.5%
2740357  8823   1.6%
2513125  8148   1.5%
2505020  7559   1.3%
2430076  8311   1.5%
2425539  8926   1.6%

Interactions

2024-07-17T07:36:58.108436image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:36:59.350530image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:00.657342image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:01.869401image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:03.083529image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:04.294707image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:05.549976image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:06.920394image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:36:57.049021image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:36:58.244474image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:36:59.493539image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:00.806746image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:02.008897image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:03.225802image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:04.429162image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/
2024-07-17T07:37:05.688006image/svg+xmlMatplotlib v3.8.2, https://matplotlib.org/

Correlations

variable | category | deplaned_passenger | enplaned_passenger | incident_day | incident_no | incident_year | latitude | longitude | max_temperature | min_temperature | neighborhood | precipitation | subcategory
category | 1.000 | -0.005 | -0.005 | 0.027 | 0.017 | 0.008 | -0.043 | 0.010 | -0.001 | -0.002 | 0.088 | -0.002 | 0.924
deplaned_passenger | -0.005 | 1.000 | 0.987 | 0.012 | -0.340 | -0.358 | 0.039 | 0.039 | 0.155 | 0.247 | 0.036 | -0.056 | 0.055
enplaned_passenger | -0.005 | 0.987 | 1.000 | 0.011 | -0.317 | -0.341 | 0.038 | 0.038 | 0.150 | 0.236 | 0.037 | -0.048 | 0.057
incident_day | 0.027 | 0.012 | 0.011 | 1.000 | 0.001 | 0.004 | -0.011 | -0.005 | -0.006 | 0.012 | 0.018 | 0.014 | 0.033
incident_no | 0.017 | -0.340 | -0.317 | 0.001 | 1.000 | 0.983 | -0.011 | -0.026 | -0.029 | -0.073 | 0.033 | 0.017 | 0.112
incident_year | 0.008 | -0.358 | -0.341 | 0.004 | 0.983 | 1.000 | -0.032 | -0.032 | -0.050 | -0.103 | 0.048 | 0.023 | 0.078
latitude | -0.043 | 0.039 | 0.038 | -0.011 | -0.011 | -0.032 | 1.000 | 0.209 | 0.005 | 0.014 | 0.769 | -0.005 | 0.121
longitude | 0.010 | 0.039 | 0.038 | -0.005 | -0.026 | -0.032 | 0.209 | 1.000 | 0.002 | 0.002 | 0.679 | -0.002 | 0.104
max_temperature | -0.001 | 0.155 | 0.150 | -0.006 | -0.029 | -0.050 | 0.005 | 0.002 | 1.000 | 0.707 | 0.007 | -0.406 | 0.012
min_temperature | -0.002 | 0.247 | 0.236 | 0.012 | -0.073 | -0.103 | 0.014 | 0.002 | 0.707 | 1.000 | 0.010 | -0.210 | 0.015
neighborhood | 0.088 | 0.036 | 0.037 | 0.018 | 0.033 | 0.048 | 0.769 | 0.679 | 0.007 | 0.010 | 1.000 | -0.000 | 0.089
precipitation | -0.002 | -0.056 | -0.048 | 0.014 | 0.017 | 0.023 | -0.005 | -0.002 | -0.406 | -0.210 | -0.000 | 1.000 | 0.008
subcategory | 0.924 | 0.055 | 0.057 | 0.033 | 0.112 | 0.078 | 0.121 | 0.104 | 0.012 | 0.015 | 0.089 | 0.008 | 1.000
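
A matrix like the one above can be scanned for strongly associated pairs in a few lines. The following is a minimal sketch, not the report's own logic: the helper name `strong_pairs` and the 0.5 cutoff are assumptions chosen for illustration, and the small input frame copies four variables from the matrix above.

```python
import pandas as pd

def strong_pairs(corr: pd.DataFrame, threshold: float = 0.5):
    """List each unordered variable pair whose |coefficient| >= threshold.

    `threshold` is a hypothetical cutoff for illustration only.
    """
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, skips the diagonal
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    # Strongest association first
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Sub-matrix copied from the correlation table above
names = ["latitude", "longitude", "neighborhood", "precipitation"]
corr = pd.DataFrame(
    [
        [1.000, 0.209, 0.769, -0.005],
        [0.209, 1.000, 0.679, -0.002],
        [0.769, 0.679, 1.000, -0.000],
        [-0.005, -0.002, -0.000, 1.000],
    ],
    index=names,
    columns=names,
)
pairs = strong_pairs(corr)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this sub-matrix, only the two pairs involving neighborhood pass the assumed cutoff.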

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
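
The per-column nullity counts behind such plots can be computed directly with pandas. This is a minimal sketch under stated assumptions: the helper name `nullity_by_column` and the toy values are hypothetical and are not taken from the profiled dataset, which reports zero missing cells.

```python
import pandas as pd

def nullity_by_column(df: pd.DataFrame) -> pd.DataFrame:
    """Return the missing-cell count and percentage for each column."""
    missing = df.isna().sum()
    return pd.DataFrame({
        "missing": missing,
        "missing_pct": 100.0 * missing / len(df),
    })

# Toy frame with hypothetical gaps, for illustration only
toy = pd.DataFrame({
    "neighborhood": ["Mission", None, "Tenderloin", "Portola"],
    "precipitation": [0.0, 0.02, None, None],
})
summary = nullity_by_column(toy)
print(summary)
```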

Sample

First rows

index | incident_date | incident_time | incident_year | incident_day | incident_no | category | subcategory | description | neighborhood | latitude | longitude | precipitation | min_temperature | max_temperature | deplaned_passenger | enplaned_passenger
406 | 2018-01-01 | 00:55 | 2018 | Monday | 180000172 | Assault | Simple Assault | Battery | Mission | 37.768473 | -122.405828 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
48 | 2018-01-01 | 00:00 | 2018 | Monday | 200421296 | Miscellaneous Investigation | Miscellaneous Investigation | Miscellaneous Investigation | Tenderloin | 37.778719 | -122.414741 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
453 | 2018-01-01 | 00:00 | 2018 | Monday | 200421296 | Miscellaneous Investigation | Miscellaneous Investigation | Miscellaneous Investigation | Tenderloin | 37.778719 | -122.414741 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
487 | 2018-01-01 | 00:01 | 2018 | Monday | 200378998 | Fraud | Fraud | Fraudulent Game or Trick, Obtaining Money or Property | Outer Richmond | 37.781924 | -122.487020 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
123 | 2018-01-01 | 07:00 | 2018 | Monday | 200309028 | Miscellaneous Investigation | Miscellaneous Investigation | Miscellaneous Investigation | Financial District/South Beach | 37.786379 | -122.395645 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
376 | 2018-01-01 | 07:00 | 2018 | Monday | 206034998 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Bayview Hunters Point | 37.720930 | -122.397271 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
112 | 2018-01-01 | 07:00 | 2018 | Monday | 206034998 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Bayview Hunters Point | 37.720930 | -122.397271 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
414 | 2018-01-01 | 00:01 | 2018 | Monday | 200147248 | Fraud | Fraud | False Personation | Mission Bay | 37.772831 | -122.391374 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
277 | 2018-01-01 | 05:20 | 2018 | Monday | 180000570 | Assault | Aggravated Assault | Assault, Aggravated, W/ Gun | Financial District/South Beach | 37.785542 | -122.396705 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0
530 | 2018-01-01 | 00:00 | 2018 | Monday | 200118738 | Larceny Theft | Theft From Vehicle | License Plate, Stolen | Mission | 37.760427 | -122.410962 | 0.0 | 48.0 | 61.0 | 2172552.0 | 2016637.0

Last rows

index | incident_date | incident_time | incident_year | incident_day | incident_no | category | subcategory | description | neighborhood | latitude | longitude | precipitation | min_temperature | max_temperature | deplaned_passenger | enplaned_passenger
867846 | 2024-07-09 | 14:20 | 2024 | Tuesday | 240427890 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, $200-$950 | Mission | 37.769917 | -122.421585 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867938 | 2024-07-09 | 09:40 | 2024 | Tuesday | 240427135 | Vandalism | Vandalism | Malicious Mischief, Vandalism to Property | Portola | 37.720768 | -122.404015 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867926 | 2024-07-09 | 19:30 | 2024 | Tuesday | 240428713 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, >$950 | North Beach | 37.805618 | -122.413605 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867915 | 2024-07-09 | 09:19 | 2024 | Tuesday | 240427787 | Disorderly Conduct | Intimidation | Terrorist Threats | Tenderloin | 37.785789 | -122.412971 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867850 | 2024-07-09 | 09:25 | 2024 | Tuesday | 240427072 | Assault | Simple Assault | Battery | Nob Hill | 37.788433 | -122.418556 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867944 | 2024-07-09 | 11:00 | 2024 | Tuesday | 240427652 | Burglary | Burglary - Hot Prowl | Burglary, Hot Prowl, Forcible Entry | Portola | 37.733047 | -122.409691 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867909 | 2024-07-09 | 11:00 | 2024 | Tuesday | 240427260 | Assault | Simple Assault | Battery | Bayview Hunters Point | 37.729160 | -122.392418 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867839 | 2024-07-09 | 11:50 | 2024 | Tuesday | 240427511 | Assault | Simple Assault | Battery | Mission | 37.749271 | -122.414337 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867891 | 2024-07-09 | 11:30 | 2024 | Tuesday | 240427408 | Assault | Simple Assault | Battery | South of Market | 37.784477 | -122.404266 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0
867901 | 2024-07-09 | 22:00 | 2024 | Tuesday | 240428729 | Vandalism | Vandalism | Malicious Mischief, Vandalism to Property | Tenderloin | 37.780952 | -122.411987 | 0.0 | 56.0 | 71.0 | 2112785.0 | 2019052.0

Duplicate rows

Most frequently occurring

index | incident_date | incident_time | incident_year | incident_day | incident_no | category | subcategory | description | neighborhood | latitude | longitude | precipitation | min_temperature | max_temperature | deplaned_passenger | enplaned_passenger | # duplicates
498 | 2018-03-07 | 07:00 | 2018 | Wednesday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Haight Ashbury | 37.772620 | -122.433922 | 0.02 | 53.0 | 69.0 | 2320899.0 | 2352210.0 | 55
484 | 2018-03-06 | 06:00 | 2018 | Tuesday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Haight Ashbury | 37.772620 | -122.433922 | 0.00 | 47.0 | 66.0 | 2320899.0 | 2352210.0 | 14
486 | 2018-03-06 | 07:00 | 2018 | Tuesday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Haight Ashbury | 37.772620 | -122.433922 | 0.00 | 47.0 | 66.0 | 2320899.0 | 2352210.0 | 10
523 | 2018-03-10 | 01:00 | 2018 | Saturday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Haight Ashbury | 37.772620 | -122.433922 | 0.00 | 47.0 | 58.0 | 2320899.0 | 2352210.0 | 9
1822 | 2018-09-06 | 10:30 | 2018 | Thursday | 186212385 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, >$950 | Golden Gate Park | 37.773304 | -122.468024 | 0.00 | 55.0 | 61.0 | 2340188.0 | 2302919.0 | 9
8186 | 2021-08-09 | 03:40 | 2021 | Monday | 210504882 | Motor Vehicle Theft | Motor Vehicle Theft | Vehicle, Stolen, Other Vehicle | Tenderloin | 37.781177 | -122.411700 | 0.00 | 56.0 | 67.0 | 1356733.0 | 1292158.0 | 9
499 | 2018-03-07 | 07:00 | 2018 | Wednesday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, >$950 | Mission | 37.768178 | -122.410731 | 0.02 | 53.0 | 69.0 | 2320899.0 | 2352210.0 | 8
11190 | 2022-11-16 | 14:00 | 2022 | Wednesday | 226217431 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, >$950 | Japantown | 37.785373 | -122.431366 | 0.00 | 47.0 | 69.0 | 1808759.0 | 1841032.0 | 8
770 | 2018-04-10 | 01:00 | 2018 | Tuesday | 190202001 | Larceny Theft | Larceny Theft - Other | Theft, Other Property, $50-$200 | Haight Ashbury | 37.772620 | -122.433922 | 0.00 | 52.0 | 63.0 | 2393156.0 | 2318680.0 | 7
1560 | 2018-08-02 | 17:00 | 2018 | Thursday | 186202897 | Larceny Theft | Larceny - From Vehicle | Theft, From Locked Vehicle, >$950 | North Beach | 37.805185 | -122.403436 | 0.00 | 52.0 | 63.0 | 2798782.0 | 2740357.0 | 7
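
A table of the most frequently occurring duplicates can be approximated with pandas by grouping fully identical rows and counting each group. This is a minimal sketch, assuming the rows are in a DataFrame; the helper name `most_frequent_duplicates` and the toy rows are hypothetical, and the report's own counting rules may differ.

```python
import pandas as pd

def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """Group rows that are identical across all columns and count each group."""
    # keep=False marks every member of a duplicated group, not only the repeats
    dup = df[df.duplicated(keep=False)]
    return (
        dup.groupby(list(df.columns), dropna=False)
        .size()
        .reset_index(name="# duplicates")
        .sort_values("# duplicates", ascending=False)
        .head(top)
    )

# Toy rows for illustration only (two columns instead of the full sixteen)
toy = pd.DataFrame({
    "incident_no": [190202001, 190202001, 190202001, 240427890, 240427890, 240427135],
    "neighborhood": ["Haight Ashbury"] * 3 + ["Mission"] * 2 + ["Portola"],
})
result = most_frequent_duplicates(toy)
print(result)
```

Rows that occur only once, such as the Portola row in the toy data, do not appear in the result.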